Renewable energy sources play an increasingly important role in the global energy mix, as the effort to reduce the environmental impact of energy production increases.
Out of all the renewable energy alternatives, wind energy is one of the most developed technologies worldwide. The U.S. Department of Energy has put together a guide to achieving operational efficiency using predictive maintenance practices.
Predictive maintenance uses sensor information and analysis methods to measure and predict degradation and future component capability. The idea behind predictive maintenance is that failure patterns are predictable and if component failure can be predicted accurately and the component is replaced before it fails, the costs of operation and maintenance will be much lower.
The sensors fitted across different machines involved in the process of energy generation collect data related to various environmental factors (temperature, humidity, wind speed, etc.) and additional features related to various parts of the wind turbine (gearbox, tower, blades, brake, etc.).
“ReneWind” is a company working on improving the machinery/processes involved in the production of wind energy using machine learning. It has collected sensor data on generator failures of wind turbines and shared a ciphered version of that data, as the data collected through sensors is confidential (the type of data collected varies between companies). The data has 40 predictors, with 20,000 observations in the training set and 5,000 in the test set.
The objective is to build various classification models, tune them, and find the best one that will help identify failures so that the generators could be repaired before failing/breaking to reduce the overall maintenance cost. The nature of predictions made by the classification model will translate as follows:
It is given that the cost of repairing a generator is much less than the cost of replacing it, and the cost of inspection is less than the cost of repair.
In the target variable, “1” represents a failure and “0” represents no failure.
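The cost ordering above can be turned into a rough cost estimate for a set of predictions. The sketch below assumes that a correctly predicted failure leads to a repair, a missed failure to a replacement, and a false alarm to an inspection. The unit costs and the `maintenance_cost` helper are hypothetical and only respect the stated ordering (replacement > repair > inspection).

```python
import numpy as np

# Hypothetical unit costs (not from the source); only the ordering
# replacement > repair > inspection is given in the text.
REPLACEMENT_COST = 40.0  # assumed: missed failure (false negative)
REPAIR_COST = 15.0       # assumed: failure caught in time (true positive)
INSPECTION_COST = 5.0    # assumed: false alarm (false positive)


def maintenance_cost(y_true, y_pred):
    """Total cost of a set of predictions under the assumed mapping."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # repairs
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # replacements
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # inspections
    return tp * REPAIR_COST + fn * REPLACEMENT_COST + fp * INSPECTION_COST


y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1]
total = maintenance_cost(y_true, y_pred)
print(total)
```

Under these assumed costs a missed failure is the most expensive outcome, which is one way to see why a metric that penalizes false negatives is attractive here.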
# Libraries to help with reading and manipulating data
import numpy as np
import pandas as pd
# Libraries to help with data visualization
import matplotlib.pyplot as plt
import seaborn as sns
# To tune model, get different metric scores and split data
from sklearn.model_selection import RandomizedSearchCV
from sklearn.model_selection import train_test_split, StratifiedKFold, cross_val_score
from sklearn.metrics import f1_score,accuracy_score,recall_score,precision_score
from sklearn import metrics
# To build the classification models
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, BaggingClassifier, AdaBoostClassifier
from sklearn.svm import SVC
# To oversample and undersample data
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
#For making pipelines
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
# To suppress the warnings
import warnings
warnings.filterwarnings("ignore")
# Loading the original training dataset
org_data = pd.read_csv('/Users/anshamohammed/Desktop/Drive G/specialised course/Feature_eng/Project/Train.csv.csv')
org_data
| | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | ... | V32 | V33 | V34 | V35 | V36 | V37 | V38 | V39 | V40 | Target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -4.464606 | -4.679129 | 3.101546 | 0.506130 | -0.221083 | -2.032511 | -2.910870 | 0.050714 | -1.522351 | 3.761892 | ... | 3.059700 | -1.690440 | 2.846296 | 2.235198 | 6.667486 | 0.443809 | -2.369169 | 2.950578 | -3.480324 | 0 |
| 1 | 3.365912 | 3.653381 | 0.909671 | -1.367528 | 0.332016 | 2.358938 | 0.732600 | -4.332135 | 0.565695 | -0.101080 | ... | -1.795474 | 3.032780 | -2.467514 | 1.894599 | -2.297780 | -1.731048 | 5.908837 | -0.386345 | 0.616242 | 0 |
| 2 | -3.831843 | -5.824444 | 0.634031 | -2.418815 | -1.773827 | 1.016824 | -2.098941 | -3.173204 | -2.081860 | 5.392621 | ... | -0.257101 | 0.803550 | 4.086219 | 2.292138 | 5.360850 | 0.351993 | 2.940021 | 3.839160 | -4.309402 | 0 |
| 3 | 1.618098 | 1.888342 | 7.046143 | -1.147285 | 0.083080 | -1.529780 | 0.207309 | -2.493629 | 0.344926 | 2.118578 | ... | -3.584425 | -2.577474 | 1.363769 | 0.622714 | 5.550100 | -1.526796 | 0.138853 | 3.101430 | -1.277378 | 0 |
| 4 | -0.111440 | 3.872488 | -3.758361 | -2.982897 | 3.792714 | 0.544960 | 0.205433 | 4.848994 | -1.854920 | -6.220023 | ... | 8.265896 | 6.629213 | -10.068689 | 1.222987 | -3.229763 | 1.686909 | -2.163896 | -3.644622 | 6.510338 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 19995 | -2.071318 | -1.088279 | -0.796174 | -3.011720 | -2.287540 | 2.807310 | 0.481428 | 0.105171 | -0.586599 | -2.899398 | ... | -8.273996 | 5.745013 | 0.589014 | -0.649988 | -3.043174 | 2.216461 | 0.608723 | 0.178193 | 2.927755 | 1 |
| 19996 | 2.890264 | 2.483069 | 5.643919 | 0.937053 | -1.380870 | 0.412051 | -1.593386 | -5.762498 | 2.150096 | 0.272302 | ... | -4.159092 | 1.181466 | -0.742412 | 5.368979 | -0.693028 | -1.668971 | 3.659954 | 0.819863 | -1.987265 | 0 |
| 19997 | -3.896979 | -3.942407 | -0.351364 | -2.417462 | 1.107546 | -1.527623 | -3.519882 | 2.054792 | -0.233996 | -0.357687 | ... | 7.112162 | 1.476080 | -3.953710 | 1.855555 | 5.029209 | 2.082588 | -6.409304 | 1.477138 | -0.874148 | 0 |
| 19998 | -3.187322 | -10.051662 | 5.695955 | -4.370053 | -5.354758 | -1.873044 | -3.947210 | 0.679420 | -2.389254 | 5.456756 | ... | 0.402812 | 3.163661 | 3.752095 | 8.529894 | 8.450626 | 0.203958 | -7.129918 | 4.249394 | -6.112267 | 0 |
| 19999 | -2.686903 | 1.961187 | 6.137088 | 2.600133 | 2.657241 | -4.290882 | -2.344267 | 0.974004 | -1.027462 | 0.497421 | ... | 6.620811 | -1.988786 | -1.348901 | 3.951801 | 5.449706 | -0.455411 | -2.202056 | 1.678229 | -1.974413 | 0 |
20000 rows × 41 columns
#Making a copy
df = org_data.copy()
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20000 entries, 0 to 19999
Data columns (total 41 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   V1      19982 non-null  float64
 1   V2      19982 non-null  float64
 2   V3      20000 non-null  float64
 3   V4      20000 non-null  float64
 4   V5      20000 non-null  float64
 5   V6      20000 non-null  float64
 6   V7      20000 non-null  float64
 7   V8      20000 non-null  float64
 8   V9      20000 non-null  float64
 9   V10     20000 non-null  float64
 10  V11     20000 non-null  float64
 11  V12     20000 non-null  float64
 12  V13     20000 non-null  float64
 13  V14     20000 non-null  float64
 14  V15     20000 non-null  float64
 15  V16     20000 non-null  float64
 16  V17     20000 non-null  float64
 17  V18     20000 non-null  float64
 18  V19     20000 non-null  float64
 19  V20     20000 non-null  float64
 20  V21     20000 non-null  float64
 21  V22     20000 non-null  float64
 22  V23     20000 non-null  float64
 23  V24     20000 non-null  float64
 24  V25     20000 non-null  float64
 25  V26     20000 non-null  float64
 26  V27     20000 non-null  float64
 27  V28     20000 non-null  float64
 28  V29     20000 non-null  float64
 29  V30     20000 non-null  float64
 30  V31     20000 non-null  float64
 31  V32     20000 non-null  float64
 32  V33     20000 non-null  float64
 33  V34     20000 non-null  float64
 34  V35     20000 non-null  float64
 35  V36     20000 non-null  float64
 36  V37     20000 non-null  float64
 37  V38     20000 non-null  float64
 38  V39     20000 non-null  float64
 39  V40     20000 non-null  float64
 40  Target  20000 non-null  int64
dtypes: float64(40), int64(1)
memory usage: 6.3 MB
df.describe().T
| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| V1 | 19982.0 | -0.271996 | 3.441625 | -11.876451 | -2.737146 | -0.747917 | 1.840112 | 15.493002 |
| V2 | 19982.0 | 0.440430 | 3.150784 | -12.319951 | -1.640674 | 0.471536 | 2.543967 | 13.089269 |
| V3 | 20000.0 | 2.484699 | 3.388963 | -10.708139 | 0.206860 | 2.255786 | 4.566165 | 17.090919 |
| V4 | 20000.0 | -0.083152 | 3.431595 | -15.082052 | -2.347660 | -0.135241 | 2.130615 | 13.236381 |
| V5 | 20000.0 | -0.053752 | 2.104801 | -8.603361 | -1.535607 | -0.101952 | 1.340480 | 8.133797 |
| V6 | 20000.0 | -0.995443 | 2.040970 | -10.227147 | -2.347238 | -1.000515 | 0.380330 | 6.975847 |
| V7 | 20000.0 | -0.879325 | 1.761626 | -7.949681 | -2.030926 | -0.917179 | 0.223695 | 8.006091 |
| V8 | 20000.0 | -0.548195 | 3.295756 | -15.657561 | -2.642665 | -0.389085 | 1.722965 | 11.679495 |
| V9 | 20000.0 | -0.016808 | 2.160568 | -8.596313 | -1.494973 | -0.067597 | 1.409203 | 8.137580 |
| V10 | 20000.0 | -0.012998 | 2.193201 | -9.853957 | -1.411212 | 0.100973 | 1.477045 | 8.108472 |
| V11 | 20000.0 | -1.895393 | 3.124322 | -14.832058 | -3.922404 | -1.921237 | 0.118906 | 11.826433 |
| V12 | 20000.0 | 1.604825 | 2.930454 | -12.948007 | -0.396514 | 1.507841 | 3.571454 | 15.080698 |
| V13 | 20000.0 | 1.580486 | 2.874658 | -13.228247 | -0.223545 | 1.637185 | 3.459886 | 15.419616 |
| V14 | 20000.0 | -0.950632 | 1.789651 | -7.738593 | -2.170741 | -0.957163 | 0.270677 | 5.670664 |
| V15 | 20000.0 | -2.414993 | 3.354974 | -16.416606 | -4.415322 | -2.382617 | -0.359052 | 12.246455 |
| V16 | 20000.0 | -2.925225 | 4.221717 | -20.374158 | -5.634240 | -2.682705 | -0.095046 | 13.583212 |
| V17 | 20000.0 | -0.134261 | 3.345462 | -14.091184 | -2.215611 | -0.014580 | 2.068751 | 16.756432 |
| V18 | 20000.0 | 1.189347 | 2.592276 | -11.643994 | -0.403917 | 0.883398 | 2.571770 | 13.179863 |
| V19 | 20000.0 | 1.181808 | 3.396925 | -13.491784 | -1.050168 | 1.279061 | 3.493299 | 13.237742 |
| V20 | 20000.0 | 0.023608 | 3.669477 | -13.922659 | -2.432953 | 0.033415 | 2.512372 | 16.052339 |
| V21 | 20000.0 | -3.611252 | 3.567690 | -17.956231 | -5.930360 | -3.532888 | -1.265884 | 13.840473 |
| V22 | 20000.0 | 0.951835 | 1.651547 | -10.122095 | -0.118127 | 0.974687 | 2.025594 | 7.409856 |
| V23 | 20000.0 | -0.366116 | 4.031860 | -14.866128 | -3.098756 | -0.262093 | 2.451750 | 14.458734 |
| V24 | 20000.0 | 1.134389 | 3.912069 | -16.387147 | -1.468062 | 0.969048 | 3.545975 | 17.163291 |
| V25 | 20000.0 | -0.002186 | 2.016740 | -8.228266 | -1.365178 | 0.025050 | 1.397112 | 8.223389 |
| V26 | 20000.0 | 1.873785 | 3.435137 | -11.834271 | -0.337863 | 1.950531 | 4.130037 | 16.836410 |
| V27 | 20000.0 | -0.612413 | 4.368847 | -14.904939 | -3.652323 | -0.884894 | 2.189177 | 17.560404 |
| V28 | 20000.0 | -0.883218 | 1.917713 | -9.269489 | -2.171218 | -0.891073 | 0.375884 | 6.527643 |
| V29 | 20000.0 | -0.985625 | 2.684365 | -12.579469 | -2.787443 | -1.176181 | 0.629773 | 10.722055 |
| V30 | 20000.0 | -0.015534 | 3.005258 | -14.796047 | -1.867114 | 0.184346 | 2.036229 | 12.505812 |
| V31 | 20000.0 | 0.486842 | 3.461384 | -13.722760 | -1.817772 | 0.490304 | 2.730688 | 17.255090 |
| V32 | 20000.0 | 0.303799 | 5.500400 | -19.876502 | -3.420469 | 0.052073 | 3.761722 | 23.633187 |
| V33 | 20000.0 | 0.049825 | 3.575285 | -16.898353 | -2.242857 | -0.066249 | 2.255134 | 16.692486 |
| V34 | 20000.0 | -0.462702 | 3.183841 | -17.985094 | -2.136984 | -0.255008 | 1.436935 | 14.358213 |
| V35 | 20000.0 | 2.229620 | 2.937102 | -15.349803 | 0.336191 | 2.098633 | 4.064358 | 15.291065 |
| V36 | 20000.0 | 1.514809 | 3.800860 | -14.833178 | -0.943809 | 1.566526 | 3.983939 | 19.329576 |
| V37 | 20000.0 | 0.011316 | 1.788165 | -5.478350 | -1.255819 | -0.128435 | 1.175533 | 7.467006 |
| V38 | 20000.0 | -0.344025 | 3.948147 | -17.375002 | -2.987638 | -0.316849 | 2.279399 | 15.289923 |
| V39 | 20000.0 | 0.890653 | 1.753054 | -6.438880 | -0.272250 | 0.919261 | 2.057540 | 7.759877 |
| V40 | 20000.0 | -0.875630 | 3.012155 | -11.023935 | -2.940193 | -0.920806 | 1.119897 | 10.654265 |
| Target | 20000.0 | 0.055500 | 0.228959 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 |
df.Target.value_counts()
0    18890
1     1110
Name: Target, dtype: int64
df.Target.value_counts().values[0]/df.Target.value_counts().values.sum()
0.9445
df.Target.value_counts().values[1]/df.Target.value_counts().values.sum()
0.0555
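The two class shares computed above can also be obtained in one call. The sketch below rebuilds a stand-in target series from the counts reported above (18890 and 1110) and uses `value_counts(normalize=True)`; the series itself is a reconstruction for illustration, not the loaded data.

```python
import pandas as pd

# Stand-in for df.Target, rebuilt from the class counts shown above.
target = pd.Series([0] * 18890 + [1] * 1110, name="Target")

# normalize=True returns class proportions instead of raw counts.
proportions = target.value_counts(normalize=True)
print(proportions)
```

Only about 5.5% of the observations are failures, so the classes are heavily imbalanced.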
df.isnull().sum()[df.isnull().sum() > 0]
V1    18
V2    18
dtype: int64
# Function to plot a boxplot and a histogram on a shared x-axis.
def histogram_boxplot(data, feature, figsize=(12, 7), kde=False, bins=None):
    """
    Boxplot and histogram combined

    data: dataframe
    feature: dataframe column
    figsize: size of figure (default (12,7))
    kde: whether to show the density curve (default False)
    bins: number of bins for histogram (default None)
    """
    f2, (ax_box2, ax_hist2) = plt.subplots(
        nrows=2,  # Number of rows of the subplot grid = 2
        sharex=True,  # x-axis will be shared among all subplots
        gridspec_kw={"height_ratios": (0.25, 0.75)},
        figsize=figsize,
    )  # creating the 2 subplots
    sns.boxplot(
        data=data, x=feature, ax=ax_box2, showmeans=True, color="violet"
    )  # boxplot; a triangle marker indicates the mean value of the column
    if bins:
        sns.histplot(data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins)
    else:
        # Without explicit bins, split the histogram by the Target class
        sns.histplot(data=data, x=feature, kde=kde, ax=ax_hist2, hue="Target")
    ax_hist2.axvline(
        data[feature].mean(), color="green", linestyle="--"
    )  # Add mean to the histogram
    ax_hist2.axvline(
        data[feature].median(), color="black", linestyle="-"
    )  # Add median to the histogram
# Plot every column of the training dataframe
for feature in df.columns:
    histogram_boxplot(df, feature, figsize=(12, 7), kde=False, bins=None)
plt.figure(figsize=(25,20))
sns.heatmap(df.corr(), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="Spectral")
Checking for duplicates
df.duplicated().sum()
0
There are no duplicate rows.
df.isnull().sum()[df.isnull().sum() > 0]
V1    18
V2    18
dtype: int64
imputation_dict = {'V1': df.V1.median(), 'V2' : df.V2.mean()}
df.fillna(imputation_dict, inplace= True)
df.isnull().sum()
V1        0
V2        0
V3        0
V4        0
V5        0
V6        0
V7        0
V8        0
V9        0
V10       0
V11       0
V12       0
V13       0
V14       0
V15       0
V16       0
V17       0
V18       0
V19       0
V20       0
V21       0
V22       0
V23       0
V24       0
V25       0
V26       0
V27       0
V28       0
V29       0
V30       0
V31       0
V32       0
V33       0
V34       0
V35       0
V36       0
V37       0
V38       0
V39       0
V40       0
Target    0
dtype: int64
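The imputation above computes the fill values on the full dataframe. One common variant, sketched below on toy data, learns the fill values from the training rows only and reuses them on held-out rows, using the `SimpleImputer` class imported earlier. The frames `train` and `valid` and the choice of the median for both columns are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy frames standing in for a train/validation split (values are illustrative).
train = pd.DataFrame({"V1": [1.0, np.nan, 3.0, 100.0], "V2": [0.5, 1.5, np.nan, 2.5]})
valid = pd.DataFrame({"V1": [np.nan, 2.0], "V2": [np.nan, 4.0]})

# Learn the per-column medians on the training rows only ...
imputer = SimpleImputer(strategy="median")
train_imputed = pd.DataFrame(imputer.fit_transform(train), columns=train.columns)

# ... and reuse those same medians on the held-out rows.
valid_imputed = pd.DataFrame(imputer.transform(valid), columns=valid.columns)
print(valid_imputed)
```

With only 18 missing values per column out of 20,000 rows, the practical difference is likely small here, but the pattern keeps held-out data from influencing preprocessing.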
Removing correlated features
correlation = df.corr()
# Finding the highly correlated features (|r| > 0.8); for each such pair, the later column is flagged
highly_correlated_cols = set()
for i in range(len(correlation.columns)):
    for j in range(i):
        if abs(correlation.iloc[i, j]) > 0.8:
            colname = correlation.columns[i]
            highly_correlated_cols.add(colname)
highly_correlated_cols
{'V14', 'V15', 'V16', 'V21', 'V29', 'V32'}
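The nested loop above can also be written with a mask over the lower triangle of the correlation matrix. The sketch below applies the same rule (flag the later column of any pair with absolute correlation above 0.8) to a toy frame; the column names `a`, `b`, `c` and the data are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy data: b is an exact multiple of a, c is only weakly related to either.
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    "c": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
})

corr = toy.corr().abs()

# Keep only entries below the diagonal, so row i holds correlations with columns j < i.
lower_mask = np.tril(np.ones(corr.shape, dtype=bool), k=-1)
lower = corr.where(lower_mask)

# A column is flagged if it is highly correlated with any earlier column.
to_drop = [col for col in lower.index if (lower.loc[col] > 0.8).any()]
print(to_drop)
```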
# Dropping the correlated features
df.drop(columns=list(highly_correlated_cols), inplace=True)
# Checking the correlation again
plt.figure(figsize=(25,20))
sns.heatmap(df.corr(), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="Spectral")
Outlier treatment
# Function for outlier treatment, using IQR-based capping (a form of winsorization)
def outlier_tretment(df, column):
    q1 = df[column].quantile(.25)
    q3 = df[column].quantile(.75)
    iqr = q3 - q1
    cap = q3 + 1.5 * iqr  # upper bound
    floor = q1 - 1.5 * iqr  # lower bound
    df.loc[df[column] >= cap, column] = cap
    df.loc[df[column] <= floor, column] = floor
    print('{} outliers are capped to {} and {} outliers are floored to {} for {}'.format(sum(df[column] >= cap), cap, sum(df[column] <= floor), floor, column))
    return df
df.columns.drop('Target')
Index(['V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10', 'V11',
       'V12', 'V13', 'V17', 'V18', 'V19', 'V20', 'V22', 'V23', 'V24', 'V25',
       'V26', 'V27', 'V28', 'V30', 'V31', 'V33', 'V34', 'V35', 'V36', 'V37',
       'V38', 'V39', 'V40'],
      dtype='object')
# Doing the outlier treatment for all columns
for i in df.columns.drop('Target'):
    df = outlier_tretment(df, i)
df
204 outliers are capped to 8.697038993 and 10 outliers are floored to -9.595468691 for V1
85 outliers are capped to 8.812472241125 and 98 outliers are floored to -7.907372703875001 for V2
215 outliers are capped to 11.105121678 and 60 outliers are floored to -6.332096354 for V3
139 outliers are capped to 8.848026312624999 and 89 outliers are floored to -9.065071370375 for V4
81 outliers are capped to 5.654610273874999 and 32 outliers are floored to -5.849736979125 for V5
65 outliers are capped to 4.471681297875 and 90 outliers are floored to -6.4385896071249995 for V6
203 outliers are capped to 3.6056262381250006 and 88 outliers are floored to -5.412857980875001 for V7
32 outliers are capped to 8.271409967 and 159 outliers are floored to -9.191109786999998 for V8
93 outliers are capped to 5.76546797325 and 55 outliers are floored to -5.85123781075 for V9
53 outliers are capped to 5.809430781 and 161 outliers are floored to -5.743597170999999 for V10
136 outliers are capped to 6.180870850124999 and 122 outliers are floored to -9.984368858875 for V11
82 outliers are capped to 9.523405883875 and 66 outliers are floored to -6.348465309124999 for V12
108 outliers are capped to 8.985032979875 and 195 outliers are floored to -5.748692261125 for V13
99 outliers are capped to 8.495293339749999 and 197 outliers are floored to -8.642152822249999 for V17
480 outliers are capped to 7.035300772125 and 251 outliers are floored to -4.8674477688749995 for V18
42 outliers are capped to 10.308498732124999 and 107 outliers are floored to -7.865368428875 for V19
66 outliers are capped to 9.930359781625 and 87 outliers are floored to -9.850940157375 for V20
89 outliers are capped to 5.24117474725 and 156 outliers are floored to -3.33370782275 for V22
35 outliers are capped to 10.777507935374999 and 59 outliers are floored to -11.424514369625 for V23
221 outliers are capped to 11.067031360749999 and 86 outliers are floored to -8.989118585249999 for V24
40 outliers are capped to 5.54054729775 and 74 outliers are floored to -5.50861358825 for V25
99 outliers are capped to 10.83188680575 and 143 outliers are floored to -7.039712204250001 for V26
155 outliers are capped to 10.951426229875 and 23 outliers are floored to -12.414572493125 for V27
106 outliers are capped to 4.196537342625 and 75 outliers are floored to -5.991870726375 for V28
34 outliers are capped to 7.891242076125001 and 212 outliers are floored to -7.722126908875 for V30
118 outliers are capped to 9.553376783500001 and 111 outliers are floored to -8.6404610585 for V31
225 outliers are capped to 9.002119060375 and 158 outliers are floored to -8.989842008624999 for V33
249 outliers are capped to 6.7978123625 and 554 outliers are floored to -7.497861389500001 for V34
182 outliers are capped to 9.656608620000002 and 133 outliers are floored to -5.256060092000001 for V35
127 outliers are capped to 11.37556147175 and 134 outliers are floored to -8.33543197025 for V36
134 outliers are capped to 4.8225600135 and 6 outliers are floored to -4.9028459125 for V37
81 outliers are capped to 10.179954516625 and 84 outliers are floored to -10.888193428375 for V38
85 outliers are capped to 5.55222571925 and 110 outliers are floored to -3.76693592675 for V39
91 outliers are capped to 7.2100333027499985 and 46 outliers are floored to -9.030329529249999 for V40
| | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | ... | V31 | V33 | V34 | V35 | V36 | V37 | V38 | V39 | V40 | Target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -4.464606 | -4.679129 | 3.101546 | 0.506130 | -0.221083 | -2.032511 | -2.910870 | 0.050714 | -1.522351 | 3.761892 | ... | 1.667098 | -1.690440 | 2.846296 | 2.235198 | 6.667486 | 0.443809 | -2.369169 | 2.950578 | -3.480324 | 0 |
| 1 | 3.365912 | 3.653381 | 0.909671 | -1.367528 | 0.332016 | 2.358938 | 0.732600 | -4.332135 | 0.565695 | -0.101080 | ... | 0.024883 | 3.032780 | -2.467514 | 1.894599 | -2.297780 | -1.731048 | 5.908837 | -0.386345 | 0.616242 | 0 |
| 2 | -3.831843 | -5.824444 | 0.634031 | -2.418815 | -1.773827 | 1.016824 | -2.098941 | -3.173204 | -2.081860 | 5.392621 | ... | -1.600395 | 0.803550 | 4.086219 | 2.292138 | 5.360850 | 0.351993 | 2.940021 | 3.839160 | -4.309402 | 0 |
| 3 | 1.618098 | 1.888342 | 7.046143 | -1.147285 | 0.083080 | -1.529780 | 0.207309 | -2.493629 | 0.344926 | 2.118578 | ... | 4.948770 | -2.577474 | 1.363769 | 0.622714 | 5.550100 | -1.526796 | 0.138853 | 3.101430 | -1.277378 | 0 |
| 4 | -0.111440 | 3.872488 | -3.758361 | -2.982897 | 3.792714 | 0.544960 | 0.205433 | 4.848994 | -1.854920 | -5.743597 | ... | 2.044184 | 6.629213 | -7.497861 | 1.222987 | -3.229763 | 1.686909 | -2.163896 | -3.644622 | 6.510338 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 19995 | -2.071318 | -1.088279 | -0.796174 | -3.011720 | -2.287540 | 2.807310 | 0.481428 | 0.105171 | -0.586599 | -2.899398 | ... | -3.938493 | 5.745013 | 0.589014 | -0.649988 | -3.043174 | 2.216461 | 0.608723 | 0.178193 | 2.927755 | 1 |
| 19996 | 2.890264 | 2.483069 | 5.643919 | 0.937053 | -1.380870 | 0.412051 | -1.593386 | -5.762498 | 2.150096 | 0.272302 | ... | -1.088553 | 1.181466 | -0.742412 | 5.368979 | -0.693028 | -1.668971 | 3.659954 | 0.819863 | -1.987265 | 0 |
| 19997 | -3.896979 | -3.942407 | -0.351364 | -2.417462 | 1.107546 | -1.527623 | -3.519882 | 2.054792 | -0.233996 | -0.357687 | ... | 0.981858 | 1.476080 | -3.953710 | 1.855555 | 5.029209 | 2.082588 | -6.409304 | 1.477138 | -0.874148 | 0 |
| 19998 | -3.187322 | -7.907373 | 5.695955 | -4.370053 | -5.354758 | -1.873044 | -3.947210 | 0.679420 | -2.389254 | 5.456756 | ... | 1.914766 | 3.163661 | 3.752095 | 8.529894 | 8.450626 | 0.203958 | -7.129918 | 4.249394 | -6.112267 | 0 |
| 19999 | -2.686903 | 1.961187 | 6.137088 | 2.600133 | 2.657241 | -4.290882 | -2.344267 | 0.974004 | -1.027462 | 0.497421 | ... | 4.674280 | -1.988786 | -1.348901 | 3.951801 | 5.449706 | -0.455411 | -2.202056 | 1.678229 | -1.974413 | 0 |
20000 rows × 35 columns
# outlier detection using boxplot
plt.figure(figsize=(15, 10))
for i, variable in enumerate(df.columns.drop('Target')):
    plt.subplot(7, 5, i + 1)
    sns.boxplot(data=df, x=variable)
plt.tight_layout(pad=2)
plt.show()
No outliers remain after capping: every value now lies within the IQR-based bounds.
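The same IQR-based capping can be expressed with `Series.clip`. The sketch below is a compact variant of the treatment used above, shown on a toy series; the helper name `cap_with_iqr` and the example values are hypothetical.

```python
import pandas as pd


def cap_with_iqr(series, factor=1.5):
    """Clip values to [Q1 - factor * IQR, Q3 + factor * IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - factor * iqr, upper=q3 + factor * iqr)


values = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])
capped = cap_with_iqr(values)
print(capped.tolist())
```

Here Q1 is 2 and Q3 is 4, so the upper bound is 7 and the value 100 is pulled down to it. Unlike the in-place version, `clip` returns a new series and leaves the input unchanged.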
plt.figure(figsize=(15, 10))
for i, variable in enumerate(df.columns.drop('Target')):
    plt.subplot(7, 5, i + 1)
    sns.histplot(data=df, x=variable)
plt.tight_layout(pad=2)
plt.show()
Splitting the data into train, validation, and test sets
X_train, X_test, y_train, y_test = train_test_split(df.drop('Target',axis = 1), df['Target'], train_size= .85, random_state= 10)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, train_size= .8, random_state= 10)
X_train.shape
(13600, 34)
X_val.shape
(3400, 34)
X_test.shape
(3000, 34)
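With a minority class of about 5.5%, a plain random split can leave the subsets with slightly different failure rates. As a hedged alternative to the split above, the sketch below passes `stratify` to `train_test_split` so that each subset keeps the class ratio; the toy arrays are assumptions, not the notebook's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 100 samples, 10% positives (illustrative only).
X = np.arange(200).reshape(100, 2)
y = np.array([0] * 90 + [1] * 10)

# stratify=y keeps the 90/10 class ratio in both subsets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, train_size=0.8, stratify=y, random_state=10
)
print(y_tr.mean(), y_te.mean())
```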
Which metric to optimize?
Let's define a function that reports several metrics (including recall) for a fitted model on any dataset, so that we do not have to repeat the same code while evaluating models.
# defining a function to compute different metrics to check performance of a classification model built using sklearn
def model_performance_classification_sklearn(model, predictors, target):
    """
    Function to compute different metrics to check classification model performance

    model: classifier
    predictors: independent variables
    target: dependent variable
    """
    # predicting using the independent variables
    pred = model.predict(predictors)
    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred)  # to compute Recall
    precision = precision_score(target, pred)  # to compute Precision
    f1 = f1_score(target, pred)  # to compute F1-score
    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {
            "Accuracy": acc,
            "Recall": recall,
            "Precision": precision,
            "F1": f1
        },
        index=[0],
    )
    return df_perf
# Type of scoring used to compare parameter combinations
scorer = metrics.make_scorer(metrics.recall_score)
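The scorer above uses recall, the share of actual failures that the model flags. The sketch below shows how recall follows from the confusion-matrix counts on a small made-up example; the label vectors are illustrative only.

```python
from sklearn.metrics import confusion_matrix, recall_score

# Made-up labels: 1 = failure, 0 = no failure.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Recall = TP / (TP + FN): missed failures (FN) lower it, false alarms (FP) do not.
recall_manual = tp / (tp + fn)
recall_sklearn = recall_score(y_true, y_pred)
print(recall_manual, recall_sklearn)
```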
Selecting 6 models for comparison and appending to a list
models = [] # Empty list to store all the models
# Appending models into the list
models.append(("dtree", DecisionTreeClassifier(random_state=1)))
models.append(("logreg", LogisticRegression(random_state=1)))
models.append(("Randomforest", RandomForestClassifier(random_state=1)))
models.append(("Bagging_class", BaggingClassifier(base_estimator=DecisionTreeClassifier(random_state=1))))
models.append(("Adaboost_class", AdaBoostClassifier(base_estimator=DecisionTreeClassifier(random_state=1))))
models.append(("SVM", SVC(kernel='linear', random_state=1)))
Evaluating the 6 models on the original training data
results1 = []  # Empty list to store all model's CV scores
names = []  # Empty list to store name of the models
# loop through all models to get the mean cross validated score
print("\n" "Cross-Validation performance on training dataset:" "\n")
for name, model in models:
    kfold = StratifiedKFold(
        n_splits=5, shuffle=True, random_state=1
    )  # Setting number of splits equal to 5
    cv_result = cross_val_score(
        estimator=model, X=X_train, y=y_train, scoring=scorer, cv=kfold
    )
    results1.append(cv_result)
    names.append(name)
    print("{}: {}".format(name, cv_result.mean()))
print("\n" "Validation Performance:" "\n")
for name, model in models:
    model.fit(X_train, y_train)
    scores = recall_score(y_val, model.predict(X_val))
    print("{}: {}".format(name, scores))
Cross-Validation performance on training dataset:

dtree: 0.7236842105263157
logreg: 0.5210526315789473
Randomforest: 0.7092105263157895
Bagging_class: 0.7105263157894738
Adaboost_class: 0.7263157894736842
SVM: 0.48552631578947364

Validation Performance:

dtree: 0.6720430107526881
logreg: 0.44623655913978494
Randomforest: 0.6612903225806451
Bagging_class: 0.6182795698924731
Adaboost_class: 0.6559139784946236
SVM: 0.43010752688172044
# Synthetic Minority Over Sampling Technique
sm = SMOTE(sampling_strategy=1, k_neighbors=10, random_state=1)
X_train_over, y_train_over = sm.fit_resample(X_train, y_train)
results2 = []  # Empty list to store all model's CV scores
names = []  # Empty list to store name of the models
# loop through all models to get the mean cross validated score
print("\n" "Cross-Validation performance on training dataset:" "\n")
for name, model in models:
    kfold = StratifiedKFold(
        n_splits=5, shuffle=True, random_state=1
    )  # Setting number of splits equal to 5
    cv_result = cross_val_score(
        estimator=model, X=X_train_over, y=y_train_over, scoring=scorer, cv=kfold
    )
    results2.append(cv_result)
    names.append(name)
    print("{}: {}".format(name, cv_result.mean()))
print("\n" "Validation Performance:" "\n")
for name, model in models:
    model.fit(X_train_over, y_train_over)
    scores = recall_score(y_val, model.predict(X_val))
    print("{}: {}".format(name, scores))
Cross-Validation performance on training dataset:

dtree: 0.9611370716510905
logreg: 0.8812305295950156
Randomforest: 0.9758566978193148
Bagging_class: 0.9658878504672896
Adaboost_class: 0.9607476635514018
SVM: 0.8801401869158878

Validation Performance:

dtree: 0.7580645161290323
logreg: 0.8387096774193549
Randomforest: 0.8279569892473119
Bagging_class: 0.7741935483870968
Adaboost_class: 0.7473118279569892
SVM: 0.8333333333333334
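SMOTE creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The NumPy sketch below illustrates only that interpolation idea; it is not the imbalanced-learn implementation, and the helper `smote_like_samples` and the toy points are hypothetical.

```python
import numpy as np


def smote_like_samples(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    synthetic = np.empty((n_new, minority.shape[1]))
    for row in range(n_new):
        i = rng.integers(len(minority))     # random minority sample
        j = neighbours[i, rng.integers(k)]  # one of its k nearest neighbours
        gap = rng.random()                  # position along the segment, in [0, 1)
        synthetic[row] = minority[i] + gap * (minority[j] - minority[i])
    return synthetic


# Four minority points at the corners of the unit square (illustrative).
minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
new_points = smote_like_samples(minority, n_new=5)
print(new_points)
```

Because each new point lies on a segment between two existing minority points, the synthetic data stays inside the region the minority class already occupies.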
# Random undersampler for under sampling the data
rus = RandomUnderSampler(random_state=1, sampling_strategy=1)
X_train_un, y_train_un = rus.fit_resample(X_train, y_train)
results3 = []  # Empty list to store all model's CV scores
names = []  # Empty list to store name of the models
# loop through all models to get the mean cross validated score
print("\n" "Cross-Validation performance on training dataset:" "\n")
for name, model in models:
    kfold = StratifiedKFold(
        n_splits=5, shuffle=True, random_state=1
    )  # Setting number of splits equal to 5
    cv_result = cross_val_score(
        estimator=model, X=X_train_un, y=y_train_un, scoring=scorer, cv=kfold
    )
    results3.append(cv_result)
    names.append(name)
    print("{}: {}".format(name, cv_result.mean()))
print("\n" "Validation Performance:" "\n")
for name, model in models:
    model.fit(X_train_un, y_train_un)
    scores = recall_score(y_val, model.predict(X_val))
    print("{}: {}".format(name, scores))
Cross-Validation performance on training dataset:

dtree: 0.8486842105263157
logreg: 0.8631578947368421
Randomforest: 0.8934210526315789
Bagging_class: 0.8578947368421053
Adaboost_class: 0.844736842105263
SVM: 0.8605263157894736

Validation Performance:

dtree: 0.8225806451612904
logreg: 0.8387096774193549
Randomforest: 0.8655913978494624
Bagging_class: 0.8333333333333334
Adaboost_class: 0.8279569892473119
SVM: 0.8333333333333334
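Random undersampling with a 1:1 ratio keeps every minority row and draws an equally sized random subset of the majority rows. The pandas/NumPy sketch below illustrates that idea on toy data; it is not the `RandomUnderSampler` implementation, and the helper `random_undersample` is hypothetical.

```python
import numpy as np
import pandas as pd


def random_undersample(X, y, seed=1):
    """Keep as many rows of every class as the smallest class has."""
    rng = np.random.default_rng(seed)
    n_min = int(y.value_counts().min())
    keep = []
    for label in y.unique():
        idx = y[y == label].index.to_numpy()
        # Sample without replacement; the minority class is kept in full.
        keep.extend(rng.choice(idx, size=n_min, replace=False))
    keep = sorted(keep)
    return X.loc[keep], y.loc[keep]


# Toy data: 17 negatives and 3 positives (illustrative only).
X = pd.DataFrame({"V1": np.arange(20.0)})
y = pd.Series([0] * 17 + [1] * 3)

X_bal, y_bal = random_undersample(X, y)
print(y_bal.value_counts())
```

The balanced set is much smaller than the original one, which is the usual trade-off of undersampling: balance is gained at the cost of discarding majority-class information.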
Hyperparameter tuning can take a long time to run, so to avoid that time complexity - you can use the following grids, wherever required.
param_grid = { "n_estimators": np.arange(100,150,25), "learning_rate": [0.2, 0.05, 1], "subsample":[0.5,0.7], "max_features":[0.5,0.7] }
param_grid = { "n_estimators": [100, 150, 200], "learning_rate": [0.2, 0.05], "base_estimator": [DecisionTreeClassifier(max_depth=1, random_state=1), DecisionTreeClassifier(max_depth=2, random_state=1), DecisionTreeClassifier(max_depth=3, random_state=1), ] }
param_grid = { 'max_samples': [0.8,0.9,1], 'max_features': [0.7,0.8,0.9], 'n_estimators' : [30,50,70], }
param_grid = { "n_estimators": [200,250,300], "min_samples_leaf": np.arange(1, 4), "max_features": [np.arange(0.3, 0.6, 0.1),'sqrt'], "max_samples": np.arange(0.4, 0.7, 0.1) }
param_grid = { 'max_depth': np.arange(2,6), 'min_samples_leaf': [1, 4, 7], 'max_leaf_nodes' : [10, 15], 'min_impurity_decrease': [0.0001,0.001] }
param_grid = {'C': np.arange(0.1,1.1,0.1)}
param_grid={ 'n_estimators': [150, 200, 250], 'scale_pos_weight': [5,10], 'learning_rate': [0.1,0.2], 'gamma': [0,3,5], 'subsample': [0.8,0.9] }
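How much a grid costs depends on the number of candidate combinations it spans, which is the product of the lengths of its value lists. The sketch below counts that for three of the grids listed above, with the `np.arange` ranges written out as plain lists so that the snippet needs only the standard library. The labels follow the pairing of grids and models used later in the notebook; `grid_size` and `grids` are names introduced here for illustration, and `n_iter = 50` and `cv = 5` are the values passed to `RandomizedSearchCV` further down.

```python
from math import prod


def grid_size(param_grid):
    """Number of distinct parameter combinations in a dict of candidate lists."""
    return prod(len(values) for values in param_grid.values())


grids = {
    "decision tree": {
        "max_depth": [2, 3, 4, 5],  # np.arange(2, 6)
        "min_samples_leaf": [1, 4, 7],
        "max_leaf_nodes": [10, 15],
        "min_impurity_decrease": [0.0001, 0.001],
    },
    "bagging": {
        "max_samples": [0.8, 0.9, 1],
        "max_features": [0.7, 0.8, 0.9],
        "n_estimators": [30, 50, 70],
    },
    "logistic regression": {
        # np.arange(0.1, 1.1, 0.1) yields ten values
        "C": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    },
}

n_iter = 50  # candidates requested from the randomized search
cv = 5       # folds per candidate

for name, grid in grids.items():
    size = grid_size(grid)
    candidates = min(size, n_iter)  # a finite grid cannot yield more than `size` candidates
    print(f"{name}: {size} combinations, {candidates} candidates, {candidates * cv} fits")
```

All three grids hold fewer than 50 combinations, so with `n_iter=50` the randomized search effectively covers each of them exhaustively.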
Choosing the parameter grid for the selected models
param_grid_1 = {
    'max_depth': np.arange(2, 6),
    'min_samples_leaf': [1, 4, 7],
    'max_leaf_nodes': [10, 15],
    'min_impurity_decrease': [0.0001, 0.001],
}
param_grid_2 = {'C': np.arange(0.1, 1.1, 0.1)}
param_grid_3 = {
    "n_estimators": [200, 250, 300],
    "min_samples_leaf": np.arange(1, 4),
    # Flatten the fractions so every candidate is a single value, not a whole array
    "max_features": list(np.arange(0.3, 0.6, 0.1)) + ['sqrt'],
    "max_samples": np.arange(0.4, 0.7, 0.1),
}
param_grid_4 = {
    'max_samples': [0.8, 0.9, 1],
    'max_features': [0.7, 0.8, 0.9],
    'n_estimators': [30, 50, 70],
}
param_grid_5 = {
    "n_estimators": [100, 150, 200],
    "learning_rate": [0.2, 0.05],
    "base_estimator": [
        DecisionTreeClassifier(max_depth=1, random_state=1),
        DecisionTreeClassifier(max_depth=2, random_state=1),
        DecisionTreeClassifier(max_depth=3, random_state=1),
    ],
}
param_grid_6 = {
    'C': [0.1, 1, 10],  # Regularization parameter
    'kernel': ['linear', 'rbf', 'poly'],  # Kernel type
    'degree': [2, 3, 4],  # Degree of the polynomial kernel (only for 'poly' kernel)
    'gamma': ['scale', 'auto'],  # Kernel coefficient (only for 'rbf' and 'poly' kernels)
}
param_grid = [param_grid_1, param_grid_2, param_grid_3, param_grid_4, param_grid_5, param_grid_6]
Fine-tuning the best three models selected using the original dataset
models_rscv_1 = [] # Empty list to store all the models
# Appending models into the list
models_rscv_1.append((param_grid_1,"dtree", DecisionTreeClassifier(random_state=1)))
models_rscv_1.append((param_grid_3,"Randomforest", RandomForestClassifier(random_state=1)))
models_rscv_1.append((param_grid_5,"Adaboost_class", AdaBoostClassifier(base_estimator=DecisionTreeClassifier(random_state=1))))
best_model_normal = []
for param, name, model in models_rscv_1:
    # Calling RandomizedSearchCV
    randomized_cv = RandomizedSearchCV(
        estimator=model,
        param_distributions=param,
        n_iter=50,
        n_jobs=-1,
        scoring=scorer,
        cv=5,
        random_state=1,
    )
    # Fitting parameters in RandomizedSearchCV
    randomized_cv.fit(X_train, y_train)
    best_params_1 = randomized_cv.best_params_
    best_model_1 = randomized_cv.best_estimator_
    best_model_normal.append((name, best_model_1))
    model_check = best_model_1.fit(X_train, y_train)
    scores = recall_score(y_val, model_check.predict(X_val))
    print(" The {} model's Best parameters are {} with CV score={}:".format(name, randomized_cv.best_params_, randomized_cv.best_score_))
    print("Validation score of {}: {}".format(name, scores))
The dtree model's Best parameters are {'min_samples_leaf': 1, 'min_impurity_decrease': 0.0001, 'max_leaf_nodes': 15, 'max_depth': 5} with CV score=0.525:
Validation score of dtree: 0.489247311827957
The Randomforest model's Best parameters are {'n_estimators': 300, 'min_samples_leaf': 1, 'max_samples': 0.6, 'max_features': 'sqrt'} with CV score=0.6881578947368421:
Validation score of Randomforest: 0.6397849462365591
/opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4.
  warnings.warn(
The Adaboost_class model's Best parameters are {'n_estimators': 200, 'learning_rate': 0.2, 'base_estimator': DecisionTreeClassifier(max_depth=3, random_state=1)} with CV score=0.7592105263157894:
Validation score of Adaboost_class: 0.7096774193548387
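The `FutureWarning` in the output above says that `base_estimator` was renamed to `estimator` in scikit-learn 1.2 and will be removed in 1.4. One way to keep a notebook working on both sides of that rename is to look up which keyword the installed class accepts. The sketch below shows the idea with a hypothetical helper, `pick_keyword`, demonstrated on two stand-in classes (`OldStyle` and `NewStyle`) rather than on scikit-learn itself, so that it runs without any third-party package. The commented lines at the end indicate how it might be applied to `AdaBoostClassifier`; that usage is an assumption, not something the notebook does.

```python
import inspect


def pick_keyword(cls, preferred, fallback):
    """Return `preferred` if cls.__init__ accepts it, otherwise `fallback`."""
    params = inspect.signature(cls.__init__).parameters
    return preferred if preferred in params else fallback


class OldStyle:
    """Stand-in for an ensemble class from before the rename."""

    def __init__(self, base_estimator=None, n_estimators=50):
        self.base_estimator = base_estimator
        self.n_estimators = n_estimators


class NewStyle:
    """Stand-in for an ensemble class from after the rename."""

    def __init__(self, estimator=None, n_estimators=50):
        self.estimator = estimator
        self.n_estimators = n_estimators


old_key = pick_keyword(OldStyle, "estimator", "base_estimator")
new_key = pick_keyword(NewStyle, "estimator", "base_estimator")
print(old_key, new_key)

# Hypothetical use with scikit-learn (not executed here):
# key = pick_keyword(AdaBoostClassifier, "estimator", "base_estimator")
# model = AdaBoostClassifier(**{key: DecisionTreeClassifier(random_state=1)})
# The same `key` would then name the corresponding entry in the AdaBoost grid.
```

Because the key in the parameter grid must match the constructor keyword, the same looked-up name would be used in both places.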
models_rscv_2 = [] # Empty list to store all the models
# Appending models into the list
models_rscv_2.append((param_grid_3,"Randomforest", RandomForestClassifier(random_state=1)))
models_rscv_2.append((param_grid_2,"logreg", LogisticRegression(random_state=1)))
models_rscv_2.append((param_grid_6,"SVM", SVC(random_state=1)))
Fine-tuning the best three models selected using the oversampled dataset
best_model_over = []
for param, name, model in models_rscv_2:
    # Calling RandomizedSearchCV
    randomized_cv = RandomizedSearchCV(
        estimator=model,
        param_distributions=param,
        n_iter=50,
        n_jobs=-1,
        scoring=scorer,
        cv=5,
        random_state=1,
    )
    # Fitting parameters in RandomizedSearchCV
    # NOTE: this fits on X_train/y_train, the original training split, so the
    # Randomforest result below is identical to the original-data run.
    # Pass the oversampled arrays here to tune on the oversampled data.
    randomized_cv.fit(X_train, y_train)
    best_params_1 = randomized_cv.best_params_
    best_model_1 = randomized_cv.best_estimator_
    best_model_over.append((name, best_model_1))
    model_check = best_model_1.fit(X_train, y_train)
    scores = recall_score(y_val, model_check.predict(X_val))
    print(" The {} model's Best parameters are {} with CV score={}:".format(name, randomized_cv.best_params_, randomized_cv.best_score_))
    print("Validation score of {}: {}".format(name, scores))
The Randomforest model's Best parameters are {'n_estimators': 300, 'min_samples_leaf': 1, 'max_samples': 0.6, 'max_features': 'sqrt'} with CV score=0.6881578947368421:
Validation score of Randomforest: 0.6397849462365591
The logreg model's Best parameters are {'C': 0.1} with CV score=0.5131578947368421:
Validation score of logreg: 0.44623655913978494
The SVM model's Best parameters are {'kernel': 'rbf', 'gamma': 'scale', 'degree': 4, 'C': 10} with CV score=0.8605263157894736:
Validation score of SVM: 0.8225806451612904
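The validation scores printed in this section come from `recall_score`, that is, the share of actual failures that the model flags. As a reminder of what that number measures, here is a minimal pure-Python sketch. The function name `recall` and the toy labels are made up for illustration and are not taken from the dataset; label 1 is assumed to denote a failure.

```python
def recall(y_true, y_pred, positive=1):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        return 0.0  # no positive cases to recover
    return tp / (tp + fn)


# Toy example: 4 actual failures, of which the model catches 3.
# The single false alarm (fifth item) does not affect recall.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
print(recall(y_true, y_pred))  # 0.75
```

A missed failure (false negative) lowers recall while a false alarm does not, which is why recall suits a setting where replacing a generator costs more than repairing or inspecting it.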
models_rscv_3 = [] # Empty list to store all the models
# Appending models into the list
models_rscv_3.append((param_grid_2,"logreg", LogisticRegression(random_state=1)))
models_rscv_3.append((param_grid_3,"Randomforest", RandomForestClassifier(random_state=1)))
models_rscv_3.append((param_grid_4,"Bagging_class", BaggingClassifier(base_estimator=DecisionTreeClassifier(random_state=1))))
Fine-tuning the best three models selected using the undersampled dataset
best_model_under = []
for param, name, model in models_rscv_3:
    # Calling RandomizedSearchCV
    randomized_cv = RandomizedSearchCV(
        estimator=model,
        param_distributions=param,
        n_iter=50,
        n_jobs=-1,
        scoring=scorer,
        cv=5,
        random_state=1,
    )
    # Fitting parameters in RandomizedSearchCV
    randomized_cv.fit(X_train_un, y_train_un)
    best_params_1 = randomized_cv.best_params_
    best_model_1 = randomized_cv.best_estimator_
    best_model_under.append((name, best_model_1))
    model_check = best_model_1.fit(X_train_un, y_train_un)
    scores = recall_score(y_val, model_check.predict(X_val))
    print(" The {} model's Best parameters are {} with CV score={}:".format(name, randomized_cv.best_params_, randomized_cv.best_score_))
    print("Validation score of {}: {}".format(name, scores))
The logreg model's Best parameters are {'C': 0.30000000000000004} with CV score=0.868421052631579:
Validation score of logreg: 0.8387096774193549
The Randomforest model's Best parameters are {'n_estimators': 300, 'min_samples_leaf': 1, 'max_samples': 0.6, 'max_features': 'sqrt'} with CV score=0.8960526315789472:
Validation score of Randomforest: 0.8655913978494624
warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. 
warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. 
warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. 
warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. 
warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. 
warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. 
warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn( /opt/homebrew/lib/python3.10/site-packages/sklearn/ensemble/_base.py:166: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4. warnings.warn(
The Bagging_class model's Best parameters are {'n_estimators': 70, 'max_samples': 0.8, 'max_features': 0.8} with CV score=0.8907894736842106:
Validation score of Bagging_class: 0.8548387096774194
# defining a function to compute different metrics to check performance of classification models built using sklearn
def model_performance_classification_sklearn(models, predictors, target):
    """
    Function to compute different metrics to check classification model performance

    models: list of fitted classifiers
    predictors: independent variables
    target: dependent variable
    """
    accuracy_list = []
    recall_list = []
    precision_list = []
    f1_score_list = []
    index_list = []

    for i in range(len(models)):
        # predicting using the independent variables
        pred = models[i].predict(predictors)

        acc = accuracy_score(target, pred)  # to compute Accuracy
        accuracy_list.append(acc)
        recall = recall_score(target, pred)  # to compute Recall
        recall_list.append(recall)
        precision = precision_score(target, pred)  # to compute Precision
        precision_list.append(precision)
        f1 = f1_score(target, pred)  # to compute F1-score
        f1_score_list.append(f1)

        # rows are labelled by position in the list: Model_0, Model_1, ...
        model_name = 'Model_' + str(i)
        index_list.append(model_name)

    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {
            "Accuracy": accuracy_list,
            "Recall": recall_list,
            "Precision": precision_list,
            "F1": f1_score_list,
        },
        index=index_list,
    )
    return df_perf
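To make the four columns of the resulting table concrete, the following is a minimal sketch that ties each metric back to the confusion-matrix counts. The labels are hypothetical toy values (1 = generator failure, 0 = no failure) and are not taken from the ReneWind data.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical toy labels: 1 = generator failure, 0 = no failure.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = accuracy_score(y_true, y_pred)    # (tp + tn) / all observations
recall = recall_score(y_true, y_pred)        # tp / (tp + fn): share of real failures caught
precision = precision_score(y_true, y_pred)  # tp / (tp + fp): share of alarms that were real
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(tn, fp, fn, tp)
print(accuracy, recall, precision, f1)
```

In the problem setting described at the top, a missed failure (a false negative) corresponds to a replacement, so recall is the column that tracks the most expensive kind of error.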
all_9_models = best_model_normal + best_model_over + best_model_under
all_9_models[0][1]
DecisionTreeClassifier(max_depth=5, max_leaf_nodes=15,
                       min_impurity_decrease=0.0001, random_state=1)
all_9_models_only = [i[1] for i in all_9_models]
all_9_models_only
[DecisionTreeClassifier(max_depth=5, max_leaf_nodes=15,
min_impurity_decrease=0.0001, random_state=1),
RandomForestClassifier(max_samples=0.6, n_estimators=300, random_state=1),
AdaBoostClassifier(base_estimator=DecisionTreeClassifier(max_depth=3,
random_state=1),
learning_rate=0.2, n_estimators=200),
RandomForestClassifier(max_samples=0.6, n_estimators=300, random_state=1),
LogisticRegression(C=0.1, random_state=1),
SVC(C=10, degree=4, random_state=1),
LogisticRegression(C=0.30000000000000004, random_state=1),
RandomForestClassifier(max_samples=0.6, n_estimators=300, random_state=1),
BaggingClassifier(base_estimator=DecisionTreeClassifier(random_state=1),
max_features=0.8, max_samples=0.8, n_estimators=70)]
model_performance_classification_sklearn(all_9_models_only, X_train, y_train)
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| Model_0 | 0.974779 | 0.590789 | 0.933472 | 0.723610 |
| Model_1 | 0.994706 | 0.907895 | 0.997110 | 0.950413 |
| Model_2 | 0.999412 | 0.989474 | 1.000000 | 0.994709 |
| Model_3 | 0.994706 | 0.907895 | 0.997110 | 0.950413 |
| Model_4 | 0.968750 | 0.517105 | 0.871397 | 0.649050 |
| Model_5 | 0.993971 | 0.893421 | 0.998529 | 0.943056 |
| Model_6 | 0.865515 | 0.865789 | 0.275891 | 0.418442 |
| Model_7 | 0.947574 | 0.976316 | 0.516354 | 0.675467 |
| Model_8 | 0.948603 | 0.994737 | 0.521020 | 0.683853 |
model_performance_classification_sklearn(all_9_models_only, X_val, y_val)
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| Model_0 | 0.965588 | 0.489247 | 0.805310 | 0.608696 |
| Model_1 | 0.979118 | 0.639785 | 0.967480 | 0.770227 |
| Model_2 | 0.981176 | 0.709677 | 0.929577 | 0.804878 |
| Model_3 | 0.979118 | 0.639785 | 0.967480 | 0.770227 |
| Model_4 | 0.963529 | 0.446237 | 0.798077 | 0.572414 |
| Model_5 | 0.989412 | 0.822581 | 0.980769 | 0.894737 |
| Model_6 | 0.858529 | 0.838710 | 0.257002 | 0.393443 |
| Model_7 | 0.936471 | 0.865591 | 0.457386 | 0.598513 |
| Model_8 | 0.933824 | 0.854839 | 0.445378 | 0.585635 |
model_performance_classification_sklearn(all_9_models_only, X_test, y_test)
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| Model_0 | 0.965667 | 0.475610 | 0.821053 | 0.602317 |
| Model_1 | 0.981000 | 0.658537 | 0.990826 | 0.791209 |
| Model_2 | 0.986333 | 0.756098 | 0.992000 | 0.858131 |
| Model_3 | 0.981000 | 0.658537 | 0.990826 | 0.791209 |
| Model_4 | 0.966000 | 0.463415 | 0.844444 | 0.598425 |
| Model_5 | 0.991000 | 0.841463 | 0.992806 | 0.910891 |
| Model_6 | 0.866667 | 0.829268 | 0.267717 | 0.404762 |
| Model_7 | 0.945333 | 0.896341 | 0.500000 | 0.641921 |
| Model_8 | 0.944333 | 0.890244 | 0.494915 | 0.636166 |
df_test = pd.read_csv('/Users/anshamohammed/Desktop/Drive G/specialised course/Feature_eng/Project/Test.csv.csv')
df_test
| | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | ... | V32 | V33 | V34 | V35 | V36 | V37 | V38 | V39 | V40 | Target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.613489 | -3.819640 | 2.202302 | 1.300420 | -1.184929 | -4.495964 | -1.835817 | 4.722989 | 1.206140 | -0.341909 | ... | 2.291204 | -5.411388 | 0.870073 | 0.574479 | 4.157191 | 1.428093 | -10.511342 | 0.454664 | -1.448363 | 0 |
| 1 | 0.389608 | -0.512341 | 0.527053 | -2.576776 | -1.016766 | 2.235112 | -0.441301 | -4.405744 | -0.332869 | 1.966794 | ... | -2.474936 | 2.493582 | 0.315165 | 2.059288 | 0.683859 | -0.485452 | 5.128350 | 1.720744 | -1.488235 | 0 |
| 2 | -0.874861 | -0.640632 | 4.084202 | -1.590454 | 0.525855 | -1.957592 | -0.695367 | 1.347309 | -1.732348 | 0.466500 | ... | -1.318888 | -2.997464 | 0.459664 | 0.619774 | 5.631504 | 1.323512 | -1.752154 | 1.808302 | 1.675748 | 0 |
| 3 | 0.238384 | 1.458607 | 4.014528 | 2.534478 | 1.196987 | -3.117330 | -0.924035 | 0.269493 | 1.322436 | 0.702345 | ... | 3.517918 | -3.074085 | -0.284220 | 0.954576 | 3.029331 | -1.367198 | -3.412140 | 0.906000 | -2.450889 | 0 |
| 4 | 5.828225 | 2.768260 | -1.234530 | 2.809264 | -1.641648 | -1.406698 | 0.568643 | 0.965043 | 1.918379 | -2.774855 | ... | 1.773841 | -1.501573 | -2.226702 | 4.776830 | -6.559698 | -0.805551 | -0.276007 | -3.858207 | -0.537694 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4995 | -5.120451 | 1.634804 | 1.251259 | 4.035944 | 3.291204 | -2.932230 | -1.328662 | 1.754066 | -2.984586 | 1.248633 | ... | 9.979118 | 0.063438 | 0.217281 | 3.036388 | 2.109323 | -0.557433 | 1.938718 | 0.512674 | -2.694194 | 0 |
| 4996 | -5.172498 | 1.171653 | 1.579105 | 1.219922 | 2.529627 | -0.668648 | -2.618321 | -2.000545 | 0.633791 | -0.578938 | ... | 4.423900 | 2.603811 | -2.152170 | 0.917401 | 2.156586 | 0.466963 | 0.470120 | 2.196756 | -2.376515 | 0 |
| 4997 | -1.114136 | -0.403576 | -1.764875 | -5.879475 | 3.571558 | 3.710802 | -2.482952 | -0.307614 | -0.921945 | -2.999141 | ... | 3.791778 | 7.481506 | -10.061396 | -0.387166 | 1.848509 | 1.818248 | -1.245633 | -1.260876 | 7.474682 | 0 |
| 4998 | -1.703241 | 0.614650 | 6.220503 | -0.104132 | 0.955916 | -3.278706 | -1.633855 | -0.103936 | 1.388152 | -1.065622 | ... | -4.100352 | -5.949325 | 0.550372 | -1.573640 | 6.823936 | 2.139307 | -4.036164 | 3.436051 | 0.579249 | 0 |
| 4999 | -0.603701 | 0.959550 | -0.720995 | 8.229574 | -1.815610 | -2.275547 | -2.574524 | -1.041479 | 4.129645 | -2.731288 | ... | 2.369776 | -1.062408 | 0.790772 | 4.951955 | -7.440825 | -0.069506 | -0.918083 | -2.291154 | -5.362891 | 0 |
5000 rows × 41 columns
df_test.isnull().sum()
V1 5 V2 6 V3 0 V4 0 V5 0 V6 0 V7 0 V8 0 V9 0 V10 0 V11 0 V12 0 V13 0 V14 0 V15 0 V16 0 V17 0 V18 0 V19 0 V20 0 V21 0 V22 0 V23 0 V24 0 V25 0 V26 0 V27 0 V28 0 V29 0 V30 0 V31 0 V32 0 V33 0 V34 0 V35 0 V36 0 V37 0 V38 0 V39 0 V40 0 Target 0 dtype: int64
imputation_dict = {'V1': df_test.V1.median(), 'V2' : df_test.V2.mean()}
df_test.fillna(imputation_dict, inplace= True)
df_test.drop(highly_correlated_cols, axis = 1, inplace= True)
X = df_test.drop('Target', axis = 1)
y = df_test['Target']
model_performance_classification_sklearn(all_9_models_only, X, y)
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| Model_0 | 0.9648 | 0.485816 | 0.815476 | 0.608889 |
| Model_1 | 0.9790 | 0.645390 | 0.973262 | 0.776119 |
| Model_2 | 0.9850 | 0.762411 | 0.964126 | 0.851485 |
| Model_3 | 0.9790 | 0.645390 | 0.973262 | 0.776119 |
| Model_4 | 0.9648 | 0.471631 | 0.831250 | 0.601810 |
| Model_5 | 0.9904 | 0.843972 | 0.983471 | 0.908397 |
| Model_6 | 0.9648 | 0.471631 | 0.831250 | 0.601810 |
| Model_7 | 0.9412 | 0.875887 | 0.488142 | 0.626904 |
| Model_8 | 0.9412 | 0.872340 | 0.488095 | 0.625954 |
# to create pipeline and make_pipeline
from sklearn.pipeline import Pipeline, make_pipeline
#Selecting the best Model
all_9_models_only[5]
SVC(C=10, degree=4, random_state=1)
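The notebook does not state the criterion behind picking index 5. As a hedged sketch, assuming the choice is driven by the validation-set metrics, the ranking can be reproduced from the Recall and F1 columns of the validation table above. The frame name `val_perf` is introduced here for illustration only.

```python
import pandas as pd

# Validation-set Recall and F1, copied from the validation table above.
val_perf = pd.DataFrame(
    {
        "Recall": [0.489247, 0.639785, 0.709677, 0.639785, 0.446237,
                   0.822581, 0.838710, 0.865591, 0.854839],
        "F1": [0.608696, 0.770227, 0.804878, 0.770227, 0.572414,
               0.894737, 0.393443, 0.598513, 0.585635],
    },
    index=[f"Model_{i}" for i in range(9)],
)

best_by_f1 = val_perf["F1"].idxmax()          # row label with the highest F1
best_by_recall = val_perf["Recall"].idxmax()  # row label with the highest recall

print(best_by_f1, best_by_recall)
```

Model_5 (the SVC) leads on F1, while Model_7 has slightly higher recall at roughly half the precision, so the SVC looks like the more balanced choice under these numbers.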
#Making imputer pipeline
numeric_transformer1 = Pipeline(steps=[("imputer", SimpleImputer(strategy="median"))])
numeric_transformer2 = Pipeline(steps=[("imputer", SimpleImputer(strategy="mean"))])
# helper to drop the highly correlated columns (the pipeline below does not call it)
def drop_cols(df, highly_correlated_cols):
    return df.drop(highly_correlated_cols, axis=1)
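If the column-dropping step were to be made part of a scikit-learn pipeline, one option is to wrap a plain function in `FunctionTransformer`. The following is a minimal sketch on a toy frame; `cols_to_drop` and the toy data are hypothetical stand-ins, not the notebook's actual `highly_correlated_cols`.

```python
import pandas as pd
from sklearn.preprocessing import FunctionTransformer


def drop_cols(df, cols):
    """Return a copy of df without the listed columns."""
    return df.drop(columns=cols)


# Hypothetical stand-in for the notebook's highly_correlated_cols list.
cols_to_drop = ["V14", "V15"]

# kw_args forwards extra keyword arguments to drop_cols on every transform call.
dropper = FunctionTransformer(drop_cols, kw_args={"cols": cols_to_drop})

# Toy frame, for illustration only.
toy = pd.DataFrame({"V1": [0.1, 0.2], "V14": [1.0, 2.0], "V15": [3.0, 4.0]})
reduced = dropper.fit_transform(toy)

print(list(reduced.columns))
```

A transformer like `dropper` could then be placed as the first step of a `Pipeline`, so that raw data with all 40 predictors can be passed in directly.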
#Making imputer Column transformer
preprocessor = ColumnTransformer(
    transformers=[
        ("num1", numeric_transformer1, ['V1']),
        ("num2", numeric_transformer2, ['V2']),
    ],
    remainder="passthrough",
)
preprocessor
ColumnTransformer(remainder='passthrough',
                  transformers=[('num1',
                                 Pipeline(steps=[('imputer',
                                                  SimpleImputer(strategy='median'))]),
                                 ['V1']),
                                ('num2',
                                 Pipeline(steps=[('imputer', SimpleImputer())]),
                                 ['V2'])])
# Creating new pipeline with best parameters
# note: the final step is labelled "LGR", but the estimator is the tuned SVC selected above
pipe = Pipeline(
    steps=[("pre", preprocessor),
           ("LGR", all_9_models_only[5])])
# Fit the model on training data
pipe.fit(X_train, y_train)
Pipeline(steps=[('pre',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num1',
                                                  Pipeline(steps=[('imputer',
                                                                   SimpleImputer(strategy='median'))]),
                                                  ['V1']),
                                                 ('num2',
                                                  Pipeline(steps=[('imputer',
                                                                   SimpleImputer())]),
                                                  ['V2'])])),
                ('LGR', SVC(C=10, degree=4, random_state=1))])
Columns passed through unchanged: ['V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10', 'V11', 'V12', 'V13', 'V17', 'V18', 'V19', 'V20', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27', 'V28', 'V30', 'V31', 'V33', 'V34', 'V35', 'V36', 'V37', 'V38', 'V39', 'V40']
recall = recall_score(y, pipe.predict(X)) # to compute Recall
recall
0.8439716312056738